In late 2019 I released Aethos 1.0, the first iteration of a package to automate common data science techniques. Since then I’ve received great feedback on how to improve Aethos, which I’ll introduce here, along with plenty of code examples to show the power and versatility of the package.

You can view the previous posts about Aethos on my blog!

Intro to Aethos

What is Aethos?

For those new to Aethos, it is a Python library of automated data science techniques and use cases, from missing value imputation, NLP pre-processing, feature engineering and data visualization to modelling, model analysis and model deployment.

To see the full capabilities and the rest of the techniques and models you can run, check out the project page on GitHub!

Problems with Aethos 1.0

A lot of the problems with the first version of Aethos related to the usability of the package and its API. The major ones were:

  • Slow import times due to the number of files and coupled packages.

  • Having 2 objects for end to end analysis - Data for transformations and Model for modelling

  • The Model object contained every model and was not specific to supervised or unsupervised problems.

  • Unintuitive API calls for adding new columns to the underlying DataFrames

  • The reporting feature was, well, garbage, and was becoming redundant alongside external tools such as converting notebooks to PDFs.

  • API had limited use cases. You couldn't just analyze your data, or just analyze a model you trained without Aethos.

  • Aethos and Pandas were not interchangeable and did not work together when transforming data.

What's new in Aethos 2.0

Aethos 2.0 looks to address the intuitiveness and usability of the package to make it easier to use and understand. It also adds the ability to work with Pandas DataFrames side by side with Aethos.

  • Reduced the import time of the package by simplifying and decoupling the Aethos modules.

  • Only 1 object to analyze, visualize and transform data, run models and analyze results.

  • Can now specify the type of problem - Classification, Regression or Unsupervised - and only see the models specific to that problem.

  • Removed the complexity of adding data to the underlying dataframes through Aethos objects. You can access the underlying dataframes with the x_train and x_test properties.

  • Removed reporting feature.

  • Introduced new objects to support new cases:

    • Analysis: To analyze, visualize and run statistical analysis (t-test, anova, etc.) on your data.

    • Classification: To analyze, visualize, run statistical analysis, transform and impute your data to run classification models.

    • Regression: To analyze, visualize, run statistical analysis, transform and impute your data to run regression models.

    • Unsupervised: To analyze, visualize, run statistical analysis, transform and impute your data to run unsupervised models.

    • ClassificationModelAnalysis: Interpret, analyze and visualize classification model results.

    • RegressionModelAnalysis: Interpret, analyze and visualize regression model results.

    • UnsupervisedModelAnalysis: Interpret, analyze and visualize unsupervised model results.

    • TextModelAnalysis: Interpret, analyze and visualize text model results.

  • Removed dot notation when accessing DataFrame columns.

  • Can now chain methods together.

Note: The model analysis objects get automatically initialized when you run a model with Aethos. They can also be initialized by themselves by supplying a model object, train data and test data.

Examples

!pip install aethos
import pandas as pd
import aethos as at

at.options.track_experiments = True # Enable experiment tracking with MLFlow

To showcase each of the objects let's load in the titanic dataset.

orig_data = pd.read_csv('https://raw.githubusercontent.com/Ashton-Sidhu/aethos/develop/examples/data/train.csv')
orig_data.describe()
PassengerId Survived Pclass Age SibSp Parch Fare
count 891.000000 891.000000 891.000000 714.000000 891.000000 891.000000 891.000000
mean 446.000000 0.383838 2.308642 29.699118 0.523008 0.381594 32.204208
std 257.353842 0.486592 0.836071 14.526497 1.102743 0.806057 49.693429
min 1.000000 0.000000 1.000000 0.420000 0.000000 0.000000 0.000000
25% 223.500000 0.000000 2.000000 20.125000 0.000000 0.000000 7.910400
50% 446.000000 0.000000 3.000000 28.000000 0.000000 0.000000 14.454200
75% 668.500000 1.000000 3.000000 38.000000 1.000000 0.000000 31.000000
max 891.000000 1.000000 3.000000 80.000000 8.000000 6.000000 512.329200

Analysis

The Analysis object is mainly for quick, easy analysis and visualization of data. It doesn't include Aethos's automated cleaning and transformation techniques, just visualizations and statistical tests. It also does not split your data, though you do have the option to provide a test set.

df = at.Analysis(orig_data, target='Survived')
df.describe()
PassengerId Survived Pclass Name Sex Age SibSp Parch Ticket Fare Cabin Embarked
count 891 891 891 NaN NaN 714 891 891 NaN 891 NaN NaN
mean 446 0.383838 2.30864 NaN NaN 29.6991 0.523008 0.381594 NaN 32.2042 NaN NaN
std 257.354 0.486592 0.836071 NaN NaN 14.5265 1.10274 0.806057 NaN 49.6934 NaN NaN
min 1 0 1 NaN NaN 0.42 0 0 NaN 0 NaN NaN
25% 223.5 0 2 NaN NaN 20.125 0 0 NaN 7.9104 NaN NaN
50% 446 0 3 NaN NaN 28 0 0 NaN 14.4542 NaN NaN
75% 668.5 1 3 NaN NaN 38 1 0 NaN 31 NaN NaN
max 891 1 3 NaN NaN 80 8 6 NaN 512.329 NaN NaN
counts 891 891 891 891 891 714 891 891 891 891 204 889
uniques 891 2 3 891 2 88 7 7 681 248 147 3
missing 0 0 0 0 0 177 0 0 0 0 687 2
missing_perc 0% 0% 0% 0% 0% 19.87% 0% 0% 0% 0% 77.10% 0.22%
types numeric bool numeric unique bool numeric numeric numeric categorical numeric categorical categorical
df.missing_values
Train set missing values.
Total Percent
Cabin 687 77.10%
Age 177 19.87%
Embarked 2 0.22%
Fare 0 0.00%
Ticket 0 0.00%
Parch 0 0.00%
SibSp 0 0.00%
Sex 0 0.00%
Name 0 0.00%
Pclass 0 0.00%
Survived 0 0.00%
PassengerId 0 0.00%
df.column_info()
PassengerId Survived Pclass Name Sex Age SibSp Parch Ticket Fare Cabin Embarked
counts 891 891 891 891 891 714 891 891 891 891 204 889
uniques 891 2 3 891 2 88 7 7 681 248 147 3
missing 0 0 0 0 0 177 0 0 0 0 687 2
missing_perc 0% 0% 0% 0% 0% 19.87% 0% 0% 0% 0% 77.10% 0.22%
types numeric bool numeric unique bool numeric numeric numeric categorical numeric categorical categorical
df.standardize_column_names()
passengerid pclass name sex age sibsp parch ticket fare cabin embarked survived
0 1 3 Braund, Mr. Owen Harris male 22.0 1 0 A/5 21171 7.2500 NaN S 0
1 2 1 Cumings, Mrs. John Bradley (Florence Briggs Th... female 38.0 1 0 PC 17599 71.2833 C85 C 1
2 3 3 Heikkinen, Miss. Laina female 26.0 0 0 STON/O2. 3101282 7.9250 NaN S 1
3 4 1 Futrelle, Mrs. Jacques Heath (Lily May Peel) female 35.0 1 0 113803 53.1000 C123 S 1
4 5 3 Allen, Mr. William Henry male 35.0 0 0 373450 8.0500 NaN S 0
df.describe_column('fare')
mean                        32.2042
std                         49.6934
variance                    2469.44
min                               0
max                         512.329
mode                           8.05
5%                            7.225
25%                          7.9104
50%                         14.4542
75%                              31
95%                         112.079
iqr                         23.0896
kurtosis                    33.3981
skewness                    4.78732
sum                         28693.9
mad                         28.1637
cv                          1.54307
zeros_num                        15
zeros_perc                    1.68%
deviating_of_mean                20
deviating_of_mean_perc        2.24%
deviating_of_median              53
deviating_of_median_perc      5.95%
top_correlations                   
counts                          891
uniques                         248
missing                           0
missing_perc                     0%
types                       numeric
Name: fare, dtype: object
df.data_report()
Summarize dataset: 100%|██████████| 26/26 [00:04<00:00,  6.31it/s, Completed]                     
Generate report structure: 100%|██████████| 1/1 [00:01<00:00,  1.94s/it]
Render HTML: 100%|██████████| 1/1 [00:00<00:00,  1.07it/s]

Easily view the histogram of multiple features.

df.histogram('age', 'fare', hue='survived')

Create a configurable correlation matrix.

df.correlation_matrix(data_labels=True, hide_mirror=True)
<matplotlib.axes._subplots.AxesSubplot at 0x7fd038322748>

We can easily plot the average price each age paid for a ticket.

df.barplot(x='age', y='fare', method='mean', labels={'age': 'Age', 'fare': 'Fare'}, asc=False)

We can also easily view the relationship between age and fare and see the difference between those who survived and those who didn't.

df.scatterplot(x='age', y='fare', color='survived', labels={'age': 'Age', 'fare': 'Fare'}, marginal_x='histogram', marginal_y='histogram')

You can visualize other plots like raincloud, violin, box, pairwise, etc. I recommend checking out the examples for more!

One of the big changes is the ability to work with pandas side by side. If you transform and work with the data solely with Pandas, the Analysis object will reflect those changes. This lets you use Aethos solely for automated analysis and Pandas for transformations.

To demonstrate this, we will make a new boolean feature that indicates whether a passenger was a child, using the original pandas dataframe we created.

orig_data['is_child'] = (orig_data['age'] < 18).astype(int)
orig_data.head()
passengerid survived pclass name sex age sibsp parch ticket fare cabin embarked is_child
0 1 0 3 Braund, Mr. Owen Harris male 22.0 1 0 A/5 21171 7.2500 NaN S 0
1 2 1 1 Cumings, Mrs. John Bradley (Florence Briggs Th... female 38.0 1 0 PC 17599 71.2833 C85 C 0
2 3 1 3 Heikkinen, Miss. Laina female 26.0 0 0 STON/O2. 3101282 7.9250 NaN S 0
3 4 1 1 Futrelle, Mrs. Jacques Heath (Lily May Peel) female 35.0 1 0 113803 53.1000 C123 S 0
4 5 0 3 Allen, Mr. William Henry male 35.0 0 0 373450 8.0500 NaN S 0

Now let's see it in our Analysis object.

df.head()
passengerid survived pclass name sex age sibsp parch ticket fare cabin embarked is_child
0 1 0 3 Braund, Mr. Owen Harris male 22.0 1 0 A/5 21171 7.2500 NaN S 0
1 2 1 1 Cumings, Mrs. John Bradley (Florence Briggs Th... female 38.0 1 0 PC 17599 71.2833 C85 C 0
2 3 1 3 Heikkinen, Miss. Laina female 26.0 0 0 STON/O2. 3101282 7.9250 NaN S 0
3 4 1 1 Futrelle, Mrs. Jacques Heath (Lily May Peel) female 35.0 1 0 113803 53.1000 C123 S 0
4 5 0 3 Allen, Mr. William Henry male 35.0 0 0 373450 8.0500 NaN S 0
df.boxplot(x='is_child', y='fare', color='survived')

You can still run pandas functions on Aethos objects.

df.nunique()
passengerid    891
survived         2
pclass           3
name           891
sex              2
age             88
sibsp            7
parch            7
ticket         681
fare           248
cabin          147
embarked         3
is_child         2
dtype: int64
df['age'].nunique()
88

New Features

Introduced in Aethos 2.0 are some new analytic techniques.

Predictive Power Score

The predictive power score is an asymmetric, data-type-agnostic score that can detect linear or non-linear relationships between two columns. The score ranges from 0 (no predictive power) to 1 (perfect predictive power) and can be used as an alternative to the correlation matrix. Credit goes to 8080Labs for creating this library; you can get more info here.
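
To build intuition for that 0-to-1 range, here's a minimal sketch of the kind of normalization such a score uses for a regression target: a model's error is compared against a naive baseline's error. This is an illustrative formula following 8080Labs' description, not the ppscore internals.

```python
def pps_like_score(model_error, baseline_error):
    """A model no better than the naive baseline scores 0; a perfect model scores 1."""
    if baseline_error == 0:
        return 0.0  # the baseline is already perfect; nothing left to predict
    return max(0.0, 1 - model_error / baseline_error)

# A model with MAE 2 vs. a naive-median baseline MAE of 10:
print(pps_like_score(2, 10))   # 0.8
print(pps_like_score(10, 10))  # 0.0 -- no better than the baseline
print(pps_like_score(15, 10))  # 0.0 -- clipped, never negative
```

Because the score is built from a model predicting one column from another, it is asymmetric: column A may predict B well while B predicts A poorly.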

df.predictive_power(data_labels=True)
<matplotlib.axes._subplots.AxesSubplot at 0x7fd0384a2198>

AutoViz

AutoViz automatically visualizes your data and displays key plots based on its characteristics. Credit goes to AutoViML for creating this library; you can get more info here.

df.autoviz()
Imported AutoViz_Class version: 0.0.68. Call using: 
    from autoviz.AutoViz_Class import AutoViz_Class
    AV = AutoViz_Class()
    AutoViz(filename, sep=',', depVar='', dfte=None, header=0, verbose=0,
                            lowess=False,chart_format='svg',max_rows_analyzed=150000,max_cols_analyzed=30)
            
To remove previous versions, perform 'pip uninstall autoviz'
Shape of your Data Set: (891, 13)
Classifying variables in data set...
    12 Predictors classified...
        This does not include the Target column(s)
    4 variables removed since they were ID or low-information variables
Total Number of Scatter Plots = 3
Nothing to add Plot not being added
All plots done
Time to run AutoViz (in seconds) = 2.659

Modelling

Aethos 2.0 introduces 3 new model objects: Classification, Regression and Unsupervised. These objects have the same capabilities as the Analysis object, but can also transform your data the same way Aethos 1.0 did. For those new to Aethos: whenever you use Aethos to apply a transformation, it is fit to the training data and then applied to both the training and test data (in the case of Classification and Regression) to avoid data leakage.
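
The fit-on-train, apply-to-both pattern can be illustrated with a toy min-max scaler. This is a sketch of the general idea, not the Aethos internals:

```python
# The scaling parameters come from the training split only, then get
# applied to both splits -- the test data never influences the fit.
train_fares = [0.0, 10.0, 50.0, 100.0]
test_fares = [20.0, 120.0]

lo, hi = min(train_fares), max(train_fares)  # fit on train only

def scale(x):
    return (x - lo) / (hi - lo)

print([scale(f) for f in train_fares])  # [0.0, 0.1, 0.5, 1.0]
print([scale(f) for f in test_fares])   # [0.2, 1.2] -- unseen values can fall outside [0, 1]
```

Fitting on the combined data instead would leak information about the test distribution into the transformation, which is exactly what this pattern avoids.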

In this post we'll cover the Classification object, but the process is exactly the same for a Regression or Unsupervised problem.

df = at.Classification(orig_data, target='Survived', test_split_percentage=.25)

As with Aethos 1.0, if no test data is provided, the data is split upon initialization. In Aethos 2.0, classification problems use stratification when splitting, so the train and test sets preserve the class balance.
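
Conceptually, a stratified split samples the test set class by class so each class keeps its original proportion. A stdlib sketch with hypothetical helper names, not the Aethos implementation:

```python
import random
from collections import defaultdict

def stratified_split(rows, label_key, test_frac=0.25, seed=42):
    """Split rows per class so the test set mirrors the class balance."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row in rows:
        by_class[row[label_key]].append(row)
    train, test = [], []
    for group in by_class.values():
        rng.shuffle(group)
        n_test = round(len(group) * test_frac)
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test

# 96 toy rows with perfectly balanced labels -> the 24-row test set
# keeps the 50/50 split (12 of each class).
data = [{"survived": i % 2, "id": i} for i in range(96)]
train, test = stratified_split(data, "survived")
print(len(train), len(test), sum(r["survived"] for r in test))  # 72 24 12
```

A plain random split on a rare class could easily leave the test set with almost no positive examples; stratifying guarantees each split sees every class in proportion.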

Warning: Earlier we showed the ability to alter the original dataframe and have it reflected in the Aethos object. This is NOT the case if you do not provide a test set for the Classification and Regression object.
df.describe()
PassengerId Survived Pclass Name Sex Age SibSp Parch Ticket Fare Cabin Embarked
count 668 668 668 NaN NaN 533 668 668 NaN 668 NaN NaN
mean 441.913 0.383234 2.29192 NaN NaN 29.4192 0.510479 0.377246 NaN 32.4659 NaN NaN
std 260.048 0.486539 0.841285 NaN NaN 14.7713 1.08757 0.781087 NaN 51.5116 NaN NaN
min 1 0 1 NaN NaN 0.42 0 0 NaN 0 NaN NaN
25% 214.75 0 1.75 NaN NaN 20 0 0 NaN 7.925 NaN NaN
50% 450.5 0 3 NaN NaN 28 0 0 NaN 14.4542 NaN NaN
75% 668.25 1 3 NaN NaN 38 1 0 NaN 31.275 NaN NaN
max 891 1 3 NaN NaN 80 8 5 NaN 512.329 NaN NaN
counts 668 668 668 668 668 533 668 668 668 668 160 666
uniques 668 2 3 668 2 82 7 6 545 216 121 3
missing 0 0 0 0 0 135 0 0 0 0 508 2
missing_perc 0% 0% 0% 0% 0% 20.21% 0% 0% 0% 0% 76.05% 0.30%
types numeric bool numeric unique bool numeric numeric numeric categorical numeric categorical categorical
df.x_train.head()
PassengerId Survived Pclass Name Sex Age SibSp Parch Ticket Fare Cabin Embarked
0 482 0 2 Frost, Mr. Anthony Wood "Archie" male NaN 0 0 239854 0.0000 NaN S
1 828 1 2 Mallet, Master. Andre male 1.0 0 2 S.C./PARIS 2079 37.0042 NaN C
2 562 0 3 Sivic, Mr. Husein male 40.0 0 0 349251 7.8958 NaN S
3 865 0 2 Gill, Mr. John William male 24.0 0 0 233866 13.0000 NaN S
4 283 0 3 de Pelsmaeker, Mr. Alfons male 16.0 0 0 345778 9.5000 NaN S
df.x_test.head()
PassengerId Survived Pclass Name Sex Age SibSp Parch Ticket Fare Cabin Embarked
0 187 1 3 O'Brien, Mrs. Thomas (Johanna "Hannah" Godfrey) female NaN 1 0 370365 15.5000 NaN Q
1 321 0 3 Dennis, Mr. Samuel male 22.0 0 0 A/5 21172 7.2500 NaN S
2 379 0 3 Betros, Mr. Tannous male 20.0 0 0 2648 4.0125 NaN C
3 698 1 3 Mullens, Miss. Katherine "Katie" female NaN 0 0 35852 7.7333 NaN Q
4 509 0 3 Olsen, Mr. Henry Margido male 28.0 0 0 C 4001 22.5250 NaN S
df.missing_values
Train set missing values.
Total Percent
Cabin 508 76.05%
Age 135 20.21%
Embarked 2 0.30%
Fare 0 0.00%
Ticket 0 0.00%
Parch 0 0.00%
SibSp 0 0.00%
Sex 0 0.00%
Name 0 0.00%
Pclass 0 0.00%
Survived 0 0.00%
PassengerId 0 0.00%
Test set missing values.
Total Percent
Cabin 179 80.27%
Age 42 18.83%
Embarked 0 0.00%
Fare 0 0.00%
Ticket 0 0.00%
Parch 0 0.00%
SibSp 0 0.00%
Sex 0 0.00%
Name 0 0.00%
Pclass 0 0.00%
Survived 0 0.00%
PassengerId 0 0.00%

Tip: Aethos comes with a checklist to help give you reminders when cleaning, analyzing and transforming your data!
df.checklist()
df.standardize_column_names()
passengerid pclass name sex age sibsp parch ticket fare cabin embarked survived
0 482 2 Frost, Mr. Anthony Wood "Archie" male NaN 0 0 239854 0.0000 NaN S 0
1 828 2 Mallet, Master. Andre male 1.0 0 2 S.C./PARIS 2079 37.0042 NaN C 1
2 562 3 Sivic, Mr. Husein male 40.0 0 0 349251 7.8958 NaN S 0
3 865 2 Gill, Mr. John William male 24.0 0 0 233866 13.0000 NaN S 0
4 283 3 de Pelsmaeker, Mr. Alfons male 16.0 0 0 345778 9.5000 NaN S 0

Since this is an overview, let's select the columns we're going to work with and drop the ones we're not going to use.

df.drop(keep=['survived', 'pclass', 'sex', 'age', 'fare', 'embarked'])
pclass sex age fare embarked survived
0 2 male NaN 0.0000 S 0
1 2 male 1.0 37.0042 C 1
2 3 male 40.0 7.8958 S 0
3 2 male 24.0 13.0000 S 0
4 3 male 16.0 9.5000 S 0

Let's chain our transformations together. Remember, our transformations will be fit to the training data and automatically applied to our test data!

is_child = lambda df: 1 if df['age'] < 18 else 0

df.replace_missing_median('age') \
  .replace_missing_mostcommon('embarked') \
  .onehot_encode('sex', 'pclass', 'embarked', keep_col=False) \
  .apply(is_child, 'is_child') \
  .normalize_numeric('fare', 'age')
Pandas Apply: 100%|██████████| 668/668 [00:00<00:00, 77148.31it/s]
Pandas Apply: 100%|██████████| 223/223 [00:00<00:00, 57466.81it/s]
sex_female sex_male pclass_1 pclass_2 pclass_3 embarked_C embarked_Q embarked_S is_child fare age survived
0 0.0 1.0 0.0 1.0 0.0 0.0 0.0 1.0 0 0.000000 0.346569 0
1 0.0 1.0 0.0 1.0 0.0 1.0 0.0 0.0 1 0.072227 0.007288 1
2 0.0 1.0 0.0 0.0 1.0 0.0 0.0 1.0 0 0.015412 0.497361 0
3 0.0 1.0 0.0 1.0 0.0 0.0 0.0 1.0 0 0.025374 0.296306 0
4 0.0 1.0 0.0 0.0 1.0 0.0 0.0 1.0 1 0.018543 0.195778 0
df.x_train.head()
sex_female sex_male pclass_1 pclass_2 pclass_3 embarked_C embarked_Q embarked_S is_child fare age survived
0 0.0 1.0 0.0 1.0 0.0 0.0 0.0 1.0 0 0.000000 0.346569 0
1 0.0 1.0 0.0 1.0 0.0 1.0 0.0 0.0 1 0.072227 0.007288 1
2 0.0 1.0 0.0 0.0 1.0 0.0 0.0 1.0 0 0.015412 0.497361 0
3 0.0 1.0 0.0 1.0 0.0 0.0 0.0 1.0 0 0.025374 0.296306 0
4 0.0 1.0 0.0 0.0 1.0 0.0 0.0 1.0 1 0.018543 0.195778 0
df.x_test.head()
sex_female sex_male pclass_1 pclass_2 pclass_3 embarked_C embarked_Q embarked_S is_child fare age survived
0 1.0 0.0 0.0 0.0 1.0 0.0 1.0 0.0 0 0.030254 0.346569 1
1 0.0 1.0 0.0 0.0 1.0 0.0 0.0 1.0 0 0.014151 0.271174 0
2 0.0 1.0 0.0 0.0 1.0 1.0 0.0 0.0 0 0.007832 0.246042 0
3 1.0 0.0 0.0 0.0 1.0 0.0 1.0 0.0 0 0.015094 0.346569 1
4 0.0 1.0 0.0 0.0 1.0 0.0 0.0 1.0 0 0.043966 0.346569 0

Now let's train a Logistic Regression model.

We'll use grid search, which automatically returns the best model, with Stratified K-Fold as the cross-validation technique.

gs_params = {
    "C": [0.1, 0.5, 1],
    "max_iter": [100, 1000]
}

lr = df.LogisticRegression(
    cv_type='strat-kfold',
    gridsearch=gs_params,
    random_state=42
)
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
Gridsearching with the following parameters: {'C': [0.1, 0.5, 1], 'max_iter': [100, 1000]}
Fitting 5 folds for each of 6 candidates, totalling 30 fits
[Parallel(n_jobs=1)]: Done  30 out of  30 | elapsed:    0.2s finished
LogisticRegression(C=1, class_weight=None, dual=False, fit_intercept=True,
                   intercept_scaling=1, l1_ratio=None, max_iter=100,
                   multi_class='auto', n_jobs=None, penalty='l2',
                   random_state=42, solver='lbfgs', tol=0.0001, verbose=0,
                   warm_start=False)
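
Under the hood, the call above is roughly equivalent to scikit-learn's own grid search with stratified folds. A sketch on synthetic data; the Aethos internals may differ:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold

X, y = make_classification(n_samples=200, random_state=42)

gs = GridSearchCV(
    LogisticRegression(random_state=42),
    param_grid={"C": [0.1, 0.5, 1], "max_iter": [100, 1000]},
    cv=StratifiedKFold(n_splits=5),  # stratified CV, as Aethos uses
)
gs.fit(X, y)  # by default, refits the best estimator on all the data
print(gs.best_params_)
```

Aethos wraps this pattern so you get back the best estimator directly, already packaged in a ModelAnalysis object.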

Once a model is trained, a ModelAnalysis object is returned, which allows us to analyze, interpret and visualize our model's results. Included is a checklist to help you debug your model if it's overfit or underfit!

df.help_debug()

You can quickly cross-validate any model by calling cross_validate on the resulting ModelAnalysis object. It displays the mean score across all folds and a learning curve.

For classification problems the default cross-validation method is Stratified K-Fold, which helps maintain the class balance; for regression, the default is K-Fold.

lr.cross_validate()
lr.metrics() # Note this displays the results on the test data.
log_reg Description
Accuracy 0.780 Measures how many observations, both positive and negative, were correctly classified.
Balanced Accuracy 0.774 The balanced accuracy in binary and multiclass classification problems to deal with imbalanced datasets. It is defined as the average of recall obtained on each class.
Average Precision 0.822 Summarizes a precision-recall curve as the weighted mean of precisions achieved at each threshold
ROC AUC 0.853 Shows how good at ranking predictions your model is. It tells you what is the probability that a randomly chosen positive instance is ranked higher than a randomly chosen negative instance.
Zero One Loss 0.220 Fraction of misclassifications.
Precision 0.703 It measures how many observations predicted as positive are positive. Good to use when False Positives are costly.
Recall 0.744 It measures how many observations out of all positive observations we classified as positive. Good to use when catching all positive occurrences, usually at the cost of false positives.
Matthews Correlation Coefficient 0.542 It’s a correlation between predicted classes and ground truth.
Log Loss 0.450 Difference between ground truth and predicted score for every observation and average those errors over all observations.
Jaccard 0.566 Defined as the size of the intersection divided by the size of the union of two label sets, is used to compare set of predicted labels for a sample to the corresponding set of true labels.
Hinge Loss 0.511 Computes the average distance between the model and the data using hinge loss, a one-sided metric that considers only prediction errors.
Hamming Loss 0.220 The Hamming loss is the fraction of labels that are incorrectly predicted.
F-Beta 0.711 It’s the harmonic mean between precision and recall, with an emphasis on one or the other. Takes into account both metrics, good for imbalanced problems (spam, fraud, etc.).
F1 0.723 It’s the harmonic mean between precision and recall. Takes into account both metrics, good for imbalanced problems (spam, fraud, etc.).
Cohen Kappa 0.541 Cohen Kappa tells you how much better is your model over the random classifier that predicts based on class frequencies. Works well for imbalanced problems.
Brier Loss 0.220 It is a measure of how far your predictions lie from the true values. Basically, it is a mean square error in the probability space.

Manual vs Automated

Let's manually train a Logistic Regression and verify the results.

from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score, precision_score

X_train = df.x_train.drop("survived", axis=1)
X_test = df.x_test.drop("survived", axis=1)

y_train = df.x_train["survived"]
y_test = df.x_test["survived"]

clf = LogisticRegression(C=1, max_iter=100, random_state=42).fit(X_train, y_train)
y_pred = clf.predict(X_test)

print(f"Accuracy: {accuracy_score(y_test, y_pred).round(3)}")
print(f"AUC: {roc_auc_score(y_test, clf.decision_function(X_test)).round(3)}")
print(f"Precision: {precision_score(y_test, y_pred).round(3)}")
Accuracy: 0.78
AUC: 0.853
Precision: 0.703

Results are the same!

Model Analysis

Similar to Modelling, Aethos 2.0 introduces 4 model analysis objects: ClassificationModelAnalysis, RegressionModelAnalysis, UnsupervisedModelAnalysis and TextModelAnalysis. In Aethos 2.0 they can be initialized in 2 ways:

  • Result of training a model using Aethos

  • Initializing it on your own by providing a Model object, the train data used by the model and the test data to evaluate model performance (for Regression and Classification).

Similar to the Model objects, we're going to explore the ClassificationModelAnalysis object, but the process is the same for regression, unsupervised and text model analysis.

Initialized from Aethos

To start, we'll pick off from where we left off with modelling and view the metrics for our Logistic Regression model.

type(lr)
aethos.model_analysis.classification_model_analysis.ClassificationModelAnalysis
lr.metrics()
log_reg Description
Accuracy 0.780 Measures how many observations, both positive and negative, were correctly classified.
Balanced Accuracy 0.774 The balanced accuracy in binary and multiclass classification problems to deal with imbalanced datasets. It is defined as the average of recall obtained on each class.
Average Precision 0.822 Summarizes a precision-recall curve as the weighted mean of precisions achieved at each threshold
ROC AUC 0.853 Shows how good at ranking predictions your model is. It tells you what is the probability that a randomly chosen positive instance is ranked higher than a randomly chosen negative instance.
Zero One Loss 0.220 Fraction of misclassifications.
Precision 0.703 It measures how many observations predicted as positive are positive. Good to use when False Positives are costly.
Recall 0.744 It measures how many observations out of all positive observations we classified as positive. Good to use when catching all positive occurrences, usually at the cost of false positives.
Matthews Correlation Coefficient 0.542 It’s a correlation between predicted classes and ground truth.
Log Loss 0.450 Difference between ground truth and predicted score for every observation and average those errors over all observations.
Jaccard 0.566 Defined as the size of the intersection divided by the size of the union of two label sets, is used to compare set of predicted labels for a sample to the corresponding set of true labels.
Hinge Loss 0.511 Computes the average distance between the model and the data using hinge loss, a one-sided metric that considers only prediction errors.
Hamming Loss 0.220 The Hamming loss is the fraction of labels that are incorrectly predicted.
F-Beta 0.711 It’s the harmonic mean between precision and recall, with an emphasis on one or the other. Takes into account both metrics, good for imbalanced problems (spam, fraud, etc.).
F1 0.723 It’s the harmonic mean between precision and recall. Takes into account both metrics, good for imbalanced problems (spam, fraud, etc.).
Cohen Kappa 0.541 Cohen Kappa tells you how much better is your model over the random classifier that predicts based on class frequencies. Works well for imbalanced problems.
Brier Loss 0.220 It is a measure of how far your predictions lie from the true values. Basically, it is a mean square error in the probability space.

You can also set project metrics based on your business requirements.

at.options.project_metrics = ["Accuracy", "ROC AUC", "Precision"]
lr.metrics()
log_reg Description
Accuracy 0.780 Measures how many observations, both positive and negative, were correctly classified.
ROC AUC 0.853 Shows how good at ranking predictions your model is. It tells you what is the probability that a randomly chosen positive instance is ranked higher than a randomly chosen negative instance.
Precision 0.703 It measures how many observations predicted as positive are positive. Good to use when False Positives are costly.

If you want to view individual metrics, there are functions for those too!

lr.fbeta(beta=0.4999)
0.7111085827756309

You can analyze any model's results with just one line of code:

  • Metrics
  • Classification Report
  • Confusion Matrix
  • Decision Boundaries
  • Decision Plots
  • Dependence Plots
  • Force Plots
  • LIME Plots
  • Morris Sensitivity
  • Model Weights
  • Summary Plot
  • ROC Curve
  • Individual metrics

And this is only for classification models; each type of problem has its own set of ModelAnalysis functions!

lr.classification_report()
              precision    recall  f1-score   support

           0       0.83      0.80      0.82       137
           1       0.70      0.74      0.72        86

    accuracy                           0.78       223
   macro avg       0.77      0.77      0.77       223
weighted avg       0.78      0.78      0.78       223

lr.confusion_matrix()

You can supply features from your train set to the decision boundary plot; otherwise it will just use the first 2 features in your model. Under the hood it uses Yellowbrick's Decision Boundary visualizer to create the visualizations.

lr.decision_boundary('age', 'fare')
lr.decision_boundary()

Included are also automated SHAP use cases to interpret your model!

lr.decision_plot()
<shap.plots.decision.DecisionPlotResult at 0x7f521b43abd0>
lr.dependence_plot('age')
lr.force_plot()
lr.interpret_model()
100%|██████████| 111/111 [00:00<00:00, 617.89it/s]

View the highest weighted features in your model.

lr.model_weights()
age : -1.64
sex_male : -1.23
sex_female : 1.23
pclass_3 : -1.06
pclass_1 : 1.05
is_child : 0.56
fare : 0.46
embarked_S : -0.33
embarked_C : 0.20
embarked_Q : 0.13
pclass_2 : 0.00
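
For a logistic regression, these weights are simply the fitted coefficients. A sketch of pulling them out of a scikit-learn model directly, using toy data and hypothetical feature names:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy data: feature 0 perfectly predicts the label, feature 1 is noise.
rng = np.random.default_rng(0)
X = np.column_stack([np.tile([0.0, 1.0], 50), rng.random(100)])
y = np.tile([0, 1], 50)

clf = LogisticRegression().fit(X, y)

# Pair each (hypothetical) feature name with its weight, largest magnitude first.
weights = sorted(
    zip(["sex_male", "noise"], clf.coef_[0]),
    key=lambda t: abs(t[1]),
    reverse=True,
)
for name, w in weights:
    print(f"{name} : {w:.2f}")
```

Since the features above were min-max normalized to the same scale, comparing coefficient magnitudes like this is a reasonable first look at feature influence.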

Easily plot an ROC curve.

lr.roc_curve()
<sklearn.metrics._plot.roc_curve.RocCurveDisplay at 0x7f521b2e1f90>
lr.summary_plot()

Finally we can generate the files to deploy our model through a RESTful API using FastAPI, Gunicorn and Docker!

lr.to_service('aethos2')
Deployment files can be found at /home/sidhu/.aethos/projects/aethos2.

To run:
	docker build -t `image_name` ./
	docker run -d --name `container_name` -p `port_num`:80 `image_name`

User Initialization

If we manually trained a model, like we did earlier in the notebook, and want to use Aethos's model analysis capabilities, we can!

lr = at.ClassificationModelAnalysis(
    clf,
    df.x_train,
    df.x_test,
    target='survived',
    model_name='log_reg'
)

Note: x_train and x_test datasets must have the target variable as part of the DataFrame.

You will get the same results as above, giving you the ability to manually transform your data, train your model and use Aethos to interpret the results. I've included them below for verification.

lr.metrics()
log_reg Description
Accuracy 0.780 Measures how many observations, both positive and negative, were correctly classified.
ROC AUC 0.853 Shows how good at ranking predictions your model is. It tells you what is the probability that a randomly chosen positive instance is ranked higher than a randomly chosen negative instance.
Precision 0.703 It measures how many observations predicted as positive are positive. Good to use when False Positives are costly.
lr.decision_boundary('age', 'fare')
lr.decision_boundary()
lr.decision_plot()
<shap.plots.decision.DecisionPlotResult at 0x7f521cb2ed10>
lr.dependence_plot('age')
lr.force_plot()
lr.interpret_model()
100%|██████████| 111/111 [00:00<00:00, 727.94it/s]
lr.model_weights()
age : -1.64
sex_male : -1.23
sex_female : 1.23
pclass_3 : -1.06
pclass_1 : 1.05
is_child : 0.56
fare : 0.46
embarked_S : -0.33
embarked_C : 0.20
embarked_Q : 0.13
pclass_2 : 0.00
lr.roc_curve()
<sklearn.metrics._plot.roc_curve.RocCurveDisplay at 0x7f521d910c90>
lr.summary_plot()
lr.to_service('aethos2')
Deployment files can be found at /home/sidhu/.aethos/projects/aethos2.

To run:
	docker build -t `image_name` ./
	docker run -d --name `container_name` -p `port_num`:80 `image_name`

Feedback

I encourage all feedback about this post or Aethos. You can message me on Twitter or e-mail me at sidhuashton@gmail.com.

For any bug or feature requests, please create an issue on the GitHub repo. I welcome all feature requests and any contributions. This project is a great starter if you're looking to contribute to an open source project, and you can always message me if you need assistance getting started.